AITopics | continuous latent variable

Collaborating Authors

continuous latent variable

Information about AI from the News, Publications, and Conferences

Automatic Classification – Tagging and Summarization – Customizable Filtering and Analysis

If you are looking for an answer to the question What is Artificial Intelligence? and you only have a minute, then here's the definition the Association for the Advancement of Artificial Intelligence offers on its home page: "the scientific understanding of the mechanisms underlying thought and intelligent behavior and their embodiment in machines."

However, if you are fortunate enough to have more than a minute, then please get ready to embark upon an exciting journey exploring AI (but beware, it could last a lifetime) …

Recursive Bayesian Networks: Generalising and Unifying Probabilistic Context-Free Grammars and Dynamic Bayesian Networks

Neural Information Processing SystemsDec-23-2025, 21:18:35 GMT

Probabilistic context-free grammars (PCFGs) and dynamic Bayesian networks (DBNs) are widely used sequence models with complementary strengths and limitations. While PCFGs allow for nested hierarchical dependencies (tree structures), their latent variables (non-terminal symbols) have to be discrete. In contrast, DBNs allow for continuous latent variables, but the dependencies are strictly sequential (chain structure). Therefore, neither can be applied if the latent variables are assumed to be continuous and also to have a nested hierarchical dependency structure. In this paper, we present Recursive Bayesian Networks (RBNs), which generalise and unify PCFGs and DBNs, combining their strengths and containing both as special cases. RBNs define a joint distribution over tree-structured Bayesian networks with discrete or continuous latent variables. The main challenge lies in performing joint inference over the exponential number of possible structures and the continuous variables. We provide two solutions: 1) For arbitrary RBNs, we generalise inside and outside probabilities from PCFGs to the mixed discrete-continuous case, which allows for maximum posterior estimates of the continuous latent variables via gradient descent, while marginalising over network structures.

bayesian network, recursive bayesian network, unifying probabilistic context-free grammar, (11 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Distillation of a tractable model from the VQ-VAE

Hadžić, Armin, Papez, Milan, Pevný, Tomáš

arXiv.org Artificial IntelligenceSep-3-2025

Deep generative models with discrete latent space, such as the Vector-Quantized Variational Autoencoder (VQ-VAE), offer excellent data generation capabilities, but, due to the large size of their latent space, their probabilistic inference is deemed intractable. We demonstrate that the VQ-VAE can be distilled into a tractable model by selecting a subset of latent variables with high probabilities. This simple strategy is particularly efficient, especially if the VQ-VAE underutilizes its latent space, which is, indeed, very often the case. We frame the distilled model as a probabilistic circuit, and show that it preserves expressiveness of the VQ-VAE while providing tractable probabilistic inference. Experiments illustrate competitive performance in density estimation and conditional generation tasks, challenging the view of the VQ-VAE as an inherently intractable model.

artificial intelligence, latent space, machine learning, (18 more...)

arXiv.org Artificial Intelligence

2509.014

Country: Europe > Czechia (0.14)

Genre: Research Report (1.00)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Add feedback

Recursive Bayesian Networks: Generalising and Unifying Probabilistic Context-Free Grammars and Dynamic Bayesian Networks

Neural Information Processing SystemsOct-9-2024, 18:57:49 GMT

bayesian network, latent variable, recursive bayesian network, (8 more...)

Neural Information Processing Systems

Technology:

Information Technology > Artificial Intelligence > Natural Language > Grammars & Parsing (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (1.00)

Add feedback

Congratulations to the #ICLR2024 test of time and outstanding paper award winners

AIHubMay-8-2024, 09:15:45 GMT

The Twelfth International Conference on Learning Representations (ICLR) is taking place this week in Vienna, Austria. During the opening of the conference, the outstanding paper award winners, and honourable mentions, were announced. The conference organisers also introduced a new award for this year: the test of time award. This award honours a paper from 2013/2014 that the programme chairs judge to have had a lasting impact. Abstract: How can we perform efficient inference and learning in directed probabilistic models, in the presence of continuous latent variables with intractable posterior distributions, and large datasets?

artificial intelligence, machine learning, natural language, (17 more...)

AIHub

Country: Europe > Austria > Vienna (0.55)

Genre: Personal > Honors > Award (1.00)

Technology:

Information Technology > Artificial Intelligence > Natural Language (0.97)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (0.74)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.49)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.48)

Add feedback

Deep Switching State Space Model (DS$^3$M) for Nonlinear Time Series Forecasting with Regime Switching

Xu, Xiuqin, Peng, Hanqiu, Chen, Ying

arXiv.org Artificial IntelligenceOct-8-2023

Modern time series data often display complex nonlinear dependencies along with irregular regime-switching behaviors. These features present technical challenges in modeling, inference, and in offering insightful understanding into the underlying stochastic phenomena. To tackle these challenges, we introduce a novel modeling framework known as the Deep Switching State Space Model (DS$^3$M). This framework is engineered to make accurate forecasts for such time series while adeptly identifying the irregular regimes hidden within the dynamics. These identifications not only have significant economic ramifications but also contribute to a deeper understanding of the underlying phenomena. In DS$^3$M, the architecture employs discrete latent variables to represent regimes and continuous latent variables to account for random driving factors. By melding a Recurrent Neural Network (RNN) with a nonlinear Switching State Space Model (SSSM), we manage to capture the nonlinear dependencies and irregular regime-switching behaviors, governed by a Markov chain and parameterized using multilayer perceptrons. We validate the effectiveness and regime identification capabilities of DS$^3$M through short- and long-term forecasting tests on a wide array of simulated and real-world datasets, spanning sectors such as healthcare, economics, traffic, meteorology, and energy. Experimental results reveal that DS$^3$M outperforms several state-of-the-art models in terms of forecasting accuracy, while providing meaningful regime identifications.

ds 3, latent variable, time sery, (13 more...)

arXiv.org Artificial Intelligence

2106.02329

Country:

North America > United States (0.28)
Asia > China > Zhejiang Province > Hangzhou (0.05)
Asia > Singapore > Central Region > Singapore (0.04)
(3 more...)

Genre: Research Report > Promising Solution (0.34)

Industry:

Health & Medicine (1.00)
Banking & Finance > Economy (0.69)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Undirected Networks > Markov Models (1.00)

Add feedback

Learning and Predicting Multimodal Vehicle Action Distributions in a Unified Probabilistic Model Without Labels

Richter, Charles, Barragán, Patrick R., Karaman, Sertac

arXiv.org Artificial IntelligenceDec-13-2022

We present a unified probabilistic model that learns a representative set of discrete vehicle actions and predicts the probability of each action given a particular scenario. Our model also enables us to estimate the distribution over continuous trajectories conditioned on a scenario, representing what each discrete action would look like if executed in that scenario. While our primary objective is to learn representative action sets, these capabilities combine to produce accurate multimodal trajectory predictions as a byproduct. Although our learned action representations closely resemble semantically meaningful categories (e.g., "go straight", "turn left", etc.), our method is entirely self-supervised and does not utilize any manually generated labels or categories. Our method builds upon recent advances in variational inference and deep unsupervised clustering, resulting in full distribution estimates based on deterministic model evaluations.

artificial intelligence, machine learning, prediction, (17 more...)

arXiv.org Artificial Intelligence

2212.07013

Country: Asia > Middle East > Republic of Türkiye > Karaman Province > Karaman (0.04)

Genre: Research Report (0.50)

Industry: Transportation > Ground > Road (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Neural Networks (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.84)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning > Clustering (0.34)

Add feedback

Towards Diverse, Relevant and Coherent Open-Domain Dialogue Generation via Hybrid Latent Variables

Sun, Bin, Li, Yitong, Mi, Fei, Wang, Weichao, Li, Yiwei, Li, Kan

arXiv.org Artificial IntelligenceDec-2-2022

Conditional variational models, using either continuous or discrete latent variables, are powerful for open-domain dialogue response generation. However, previous works show that continuous latent variables tend to reduce the coherence of generated responses. In this paper, we also found that discrete latent variables have difficulty capturing more diverse expressions. To tackle these problems, we combine the merits of both continuous and discrete latent variables and propose a Hybrid Latent Variable (HLV) method. Specifically, HLV constrains the global semantics of responses through discrete latent variables and enriches responses with continuous latent variables. Thus, we diversify the generated responses while maintaining relevance and coherence. In addition, we propose Conditional Hybrid Variational Transformer (CHVT) to construct and to utilize HLV with transformers for dialogue generation. Through fine-grained symbolic-level semantic information and additive Gaussian mixing, we construct the distribution of continuous variables, prompting the generation of diverse expressions. Meanwhile, to maintain the relevance and coherence, the discrete latent variable is optimized by self-separation training. Experimental results on two dialogue generation datasets (DailyDialog and Opensubtitles) show that CHVT is superior to traditional transformer-based variational mechanism w.r.t. diversity, relevance and coherence metrics. Moreover, we also demonstrate the benefit of applying HLV to fine-tuning two pre-trained dialogue models (PLATO and BART-base).

artificial intelligence, machine learning, natural language, (19 more...)

arXiv.org Artificial Intelligence

2212.01145

Country:

North America > United States > Minnesota > Hennepin County > Minneapolis (0.14)
Asia > China > Beijing > Beijing (0.04)
Oceania > Australia > Queensland (0.04)
(2 more...)

Genre: Research Report (0.64)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Representation & Reasoning (0.88)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.34)

Add feedback

Discovering Diverse Solutions in Deep Reinforcement Learning

Osa, Takayuki, Tangkaratt, Voot, Sugiyama, Masashi

arXiv.org Artificial IntelligenceMar-11-2021

Reinforcement learning (RL) algorithms are typically limited to learning a single solution of a specified task, even though there often exists diverse solutions to a given task. Compared with learning a single solution, learning a set of diverse solutions is beneficial because diverse solutions enable robust few-shot adaptation and allow the user to select a preferred solution. Although previous studies have showed that diverse behaviors can be modeled with a policy conditioned on latent variables, an approach for modeling an infinite set of diverse solutions with continuous latent variables has not been investigated. In this study, we propose an RL method that can learn infinitely many solutions by training a policy conditioned on a continuous or discrete low-dimensional latent variable. Through continuous control tasks, we demonstrate that our method can learn diverse solutions in a data-efficient manner and that the solutions can be used for few-shot adaptation to solve unseen tasks.

continuous latent variable, latent variable, ltd3, (15 more...)

arXiv.org Artificial Intelligence

2103.07084

Country:

Asia > Japan > Honshū > Kantō > Tokyo Metropolis Prefecture > Tokyo (0.14)
Asia > Middle East > Jordan (0.04)
Oceania > Australia > New South Wales > Sydney (0.04)
Asia > Japan > Kyūshū & Okinawa > Kyūshū > Fukuoka Prefecture > Fukuoka (0.04)

Genre: Research Report > New Finding (0.68)

Industry: Education (0.93)

Technology:

Information Technology > Artificial Intelligence > Machine Learning > Reinforcement Learning (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.47)

Add feedback

Discond-VAE: Disentangling Continuous Factors from the Discrete

Choi, Jaewoong, Hwang, Geonho, Kang, Myungjoo

arXiv.org Machine LearningSep-16-2020

We propose a variant of VAE capable of disentangling both variations within each class and variations shared across all classes. To represent these generative factors of data, we introduce two sets of continuous latent variables, private variable and public variable. Our proposed framework models the private variable as a Mixture of Gaussian and the public variable as a Gaussian, respectively. Each mode of the private variable is responsible for a class of the discrete variable. Most of the previous attempts to integrate the discrete generative factors to disentanglement assume statistical independence between the continuous and discrete variables. However, this assumption does not hold in general. Our proposed model, which we call Discond-VAE, DISentangles the class-dependent CONtinuous factors from the Discrete factors by introducing the private variables. The experiments show that Discond-VAE can discover the private and public factors from data qualitatively and quantitatively.

artificial intelligence, deep learning, machine learning, (18 more...)

arXiv.org Machine Learning

2009.08039

Country:

Asia > Middle East > Jordan (0.04)
Asia > South Korea > Seoul > Seoul (0.04)

Genre: Research Report (0.82)

Technology:

Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty (0.68)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.46)

Add feedback

Unifying and generalizing models of neural dynamics during decision-making

Zoltowski, David M., Pillow, Jonathan W., Linderman, Scott W.

arXiv.org Machine LearningJan-13-2020

An open question in systems and computational neuroscience is how neural circuits accumulate evidence towards a decision. Fitting models of decision-making theory to neural activity helps answer this question, but current approaches limit the number of these models that we can fit to neural data. Here we propose a unifying framework for modeling neural activity during decision-making tasks. The framework includes the canonical drift-diffusion model and enables extensions such as multi-dimensional accumulators, variable and collapsing boundaries, and discrete jumps. Our framework is based on constraining the parameters of recurrent state-space models, for which we introduce a scalable variational Laplace-EM inference algorithm. We applied the modeling approach to spiking responses recorded from monkey parietal cortex during two decision-making tasks. We found that a two-dimensional accumulator better captured the trial-averaged responses of a set of parietal neurons than a single accumulator model. Next, we identified a variable lower boundary in the responses of an LIP neuron during a random dot motion task.

accumulator model, boundary, decision-making, (16 more...)

arXiv.org Machine Learning

2001.04571

Country:

Europe > United Kingdom > England > Cambridgeshire > Cambridge (0.04)
North America > United States > California > Los Angeles County > Long Beach (0.04)

Genre: Research Report (0.82)

Industry: Health & Medicine > Therapeutic Area > Neurology (1.00)

Technology:

Information Technology > Artificial Intelligence > Cognitive Science (0.86)
Information Technology > Artificial Intelligence > Machine Learning > Statistical Learning (0.47)
Information Technology > Artificial Intelligence > Representation & Reasoning > Uncertainty > Bayesian Inference (0.46)
Information Technology > Artificial Intelligence > Machine Learning > Learning Graphical Models > Directed Networks > Bayesian Learning (0.46)

Add feedback